DOCUMENT RESUME 
ED 133 347                                            TM 005 952


AUTHOR Huberty, Carl J.; Smith, Douglas U.

TITLE Variable Contribution in Discriminant Analysis. 

PUB DATE [Apr 76]

NOTE 15p.; Paper presented at the Annual Meeting of the 
American Educational Research Association (60th, San 
Francisco, California, April 19-23, 1976) 


EDRS PRICE MF-$0.83 HC-$1.67 Plus Postage. 

DESCRIPTORS Classification; *Discriminant Analysis; *Predictor 
Variables; *Statistical Analysis 

IDENTIFIERS Criterion Variables 


ABSTRACT 

The purpose of this study was to determine which of 
six methods of ordering variables in a discriminant analysis yields 
subsets of variables that have the greatest discriminatory power. One 
method is based on univariate mean-square (or F) ratios, a second 
method on stepwise ordering, two methods on linear discriminant 
function (LDF) variable correlations, and two methods on standardized 
LDF coefficients. Real data on 80 graduate students in statistics 
were used. It was concluded that no single method was far superior to 
the others. Related findings are discussed, as are recommendations 
for subsequent research in this area. (Author/RC) 


***********************************************************************
* Documents acquired by ERIC include many informal unpublished        *
* materials not available from other sources. ERIC makes every effort *
* to obtain the best copy available. Nevertheless, items of marginal  *
* reproducibility are often encountered and this affects the quality  *
* of the microfiche and hardcopy reproductions ERIC makes available   *
* via the ERIC Document Reproduction Service (EDRS). EDRS is not      *
* responsible for the quality of the original document. Reproductions *
* supplied by EDRS are the best that can be made from the original.   *
***********************************************************************


Variable Contribution in Discriminant Analysis


Carl J Huberty          Douglas U. Smith

University of Georgia


U.S. DEPARTMENT OF HEALTH, EDUCATION & WELFARE
NATIONAL INSTITUTE OF EDUCATION

THIS DOCUMENT HAS BEEN REPRODUCED EXACTLY AS RECEIVED FROM
THE PERSON OR ORGANIZATION ORIGINATING IT. POINTS OF VIEW OR OPINIONS
STATED DO NOT NECESSARILY REPRESENT OFFICIAL NATIONAL INSTITUTE OF
EDUCATION POSITION OR POLICY.


Paper presented at the annual meeting of the American Educational Research 
Association, San Francisco, April 1976 


Abstract 


The purpose of this study was to determine which of six methods
of ordering variables in a discriminant analysis yields subsets of
variables that have the greatest discriminatory power. One method is
based on univariate F's, a second method on stepwise ordering, two
methods on LDF-variable correlations, and two methods on standardized
LDF coefficients. Real data on 80 graduate students in statistics
were used. It was concluded that no single method was far superior
to the others. Related findings are discussed, as are recommendations
for subsequent research in this area.


Variable Contribution in Discriminant Analysis 


Introduction 


Variables involved in a discriminant analysis may be considered 
criterion variables (in an "experimental" or group-separation problem), 
or predictor variables (in an ex post facto or group-classification
problem). Thus, it would be helpful to be able to rank-order these
variables in terms of their relative contribution to either group
separation or to group classification accuracy. Such a rank ordering 
of variables would be informative for at least two reasons: (1) to 
aid in the interpretation of the discriminant analysis results for 
the data used, and (2) to discard variables for the purposes of 
subsequent research, thus lowering chances of misclassification given 
new data.

The problem of relative variable contribution has been studied
from the one-group situation (Lutz, 1974), through the two-group
situation (Cochran, 1964; Eisenbeis, Gilbert, & Avery, 1973), and to
the more general k-group situation (Eisenbeis & Avery, 1972; Henschke
& Chen, 1974; Huberty, 1975b). Some of these studies, and a few others,
have explicitly attacked the related problem of variable selection (see
Lachenbruch, 1975). The variable selection problem deals with determining
a subset of a given size from the original set of variables. The goal
may be to select the subset that maximizes the difference between
group mean vectors, or to select the subset that yields the greatest
classification accuracy. It is recognized that a subset determined
by an index of relative contribution may not be the best subset in
either of these two senses.

The focus of the present investigation was on the rank-ordering 
of variables with respect to the relative contribution made in 
classification accuracy. Six methods of ordering variables that have 
either been proposed or which have appeared in the literature were 
compared using real data. The purpose of the study, then, was to 
determine which of six methods is best, with "best" being defined in 
terms of the method which suggests subsets of the original set of 
variables having the greatest discriminatory power. In this
study, discriminatory power was assessed by the (internal and external)
classification accuracy yielded by each subset.

Variable Ordering Methods 

Two of the ordering methods selected for study are well known:
(I) univariate mean-square (or F) ratios, and (II) (forward) stepwise
discriminant analysis (BMD07M in Dixon, 1973). Two other methods are
intimately related to the eigenanalysis employed in deriving linear
discriminant functions (LDFs). One of these (III) involves the
correlations between each of the variables and each of the LDFs. For
a given variable, the squares of these correlations are summed across
the LDFs to obtain an index for that variable. These measures, the
"communalities" for each variable, are only of interest when the number
of variables is greater than one less than the number of groups
(Cooley & Lohnes, 1971, p. 253). Another method (IV) involves the
coefficients of each LDF that are applicable to standardized scores on
the variables. For the ith variable a weighted composite of the
standardized coefficients ($c_{ij}$) is used; the jth weight is the
eigenvalue ($\lambda_j$) associated with the jth LDF:
$\sum_j \lambda_j c_{ij}$. The magnitude of this index is used to
order the variables.

Finally, special cases of methods III and IV were considered in 
light of the data used in this investigation. Very often with more 
than three groups only one LDF is worthy of study. If so, method III 
simplifies to using (absolute values of) the leading LDF-variable 
correlation (Method V). And Method IV simplifies to using the 
standardized LDF coefficients (Method VI). Methods V and VI were 
considered to determine if the inclusion of a "nonsignificant" LDF in
using methods III and IV would substantially affect the discriminatory
power of subsets of various sizes. 
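
The four LDF-based indices described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names and the small correlation and coefficient matrices are hypothetical (not taken from the study), and method IV is computed as the magnitude of the eigenvalue-weighted sum of standardized coefficients, following the verbal description given here. Only the two eigenvalues are the ones reported later in the Data Analysis section.

```python
def communality_index(correlations):
    """Method III: sum of squared LDF-variable correlations for one variable."""
    return sum(r * r for r in correlations)


def weighted_coefficient_index(coefficients, eigenvalues):
    """Method IV: magnitude of the eigenvalue-weighted sum of the
    standardized coefficients for one variable."""
    return abs(sum(lam * c for lam, c in zip(eigenvalues, coefficients)))


def rank_order(index_values):
    """Variable numbers (1-based), from best (largest index) to poorest."""
    order = sorted(range(len(index_values)),
                   key=lambda i: index_values[i], reverse=True)
    return [i + 1 for i in order]


# Hypothetical values for three variables (rows) and two LDFs (columns).
structure = [[0.80, 0.10], [0.30, 0.70], [0.50, 0.20]]   # LDF-variable r's
coefs = [[1.20, 0.40], [0.30, 2.00], [0.90, 0.10]]       # standardized coefficients
eigenvalues = [1.077, 0.046]                             # reported in the paper

method_iii = rank_order([communality_index(r) for r in structure])
method_iv = rank_order([weighted_coefficient_index(c, eigenvalues) for c in coefs])
method_v = rank_order([abs(r[0]) for r in structure])    # leading LDF only
method_vi = rank_order([abs(c[0]) for c in coefs])       # leading LDF only

print(method_iii, method_iv, method_v, method_vi)
```

With these made-up inputs the small second eigenvalue leaves the method IV ordering identical to that of method VI, while the second LDF is enough to change the method III ordering relative to method V.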

Data Analysis 

The data used consisted of seven measures on 80 graduate students
(Huberty & Smith, 1975). The seven measures were: age, two Graduate
Record Examination (GRE) scores, two measures relating to undergraduate
study in mathematics/statistics, and two grade point averages. Group 1
(n1 = 19) consisted of those students who performed at the "A" level; group
2 (n2 = 37) performed at the "B" level; and group 3 (n3 = 24) performed
at the "C" and below level. Descriptive data relative to the sample
used in this investigation are given in Table 1.


Insert Table 1 about here 
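
The univariate F ratio used by method I can be computed from group sizes, means, and standard deviations alone. The sketch below is a minimal illustration, assuming the tabled standard deviations use n - 1 denominators; the function name is hypothetical. The inputs are the GRE Quantitative entries of Table 1 and the group sizes given above.

```python
def univariate_f(ns, means, sds):
    """One-way ANOVA F ratio from group sizes, means, and standard deviations
    (standard deviations assumed to be computed with n - 1 denominators)."""
    n_total = sum(ns)
    k = len(ns)
    grand_mean = sum(n * m for n, m in zip(ns, means)) / n_total
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    ss_within = sum((n - 1) * s ** 2 for n, s in zip(ns, sds))
    ms_between = ss_between / (k - 1)        # between-groups mean square
    ms_within = ss_within / (n_total - k)    # within-groups mean square
    return ms_between / ms_within


# GRE Quantitative for groups 1-3, as listed in Table 1.
f_greq = univariate_f([19, 37, 24],
                      [626.84, 543.95, 474.50],
                      [64.28, 89.49, 61.50])
print(round(f_greq, 2))   # about 21.08, the F tabled for this variable
```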


Initially, all seven variables were considered. Each of the ordering
methods considered here (except for I, univariate F's) calls for the
variables to be jointly normally distributed in the three populations,
and for these populations to have a common covariance matrix. The constant
covariance structure was judged tenable since the value of Box's F
statistic (Timm, 1975, p. 252) was less than unity. The value of
Wilks's lambda was 0.450, which yielded F = 4.81 with df = 14/142,
p < .01. The resulting eigenvalues were 1.077 and 0.046.
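
The relation between the eigenvalues, Wilks's lambda, and the F statistic can be checked directly. The sketch below assumes the usual identity that lambda is the product of 1/(1 + eigenvalue) over the LDFs, together with the exact F transformation for the three-group case; the function names are hypothetical. With the reported eigenvalues it gives F of about 4.81 on 14 and 142 degrees of freedom, and a lambda of about 0.460, which is close to but not identical with the 0.450 printed above (the gap may reflect rounding or the legibility of this copy).

```python
import math


def wilks_lambda(eigenvalues):
    """Wilks's lambda as the product of 1 / (1 + eigenvalue)."""
    product = 1.0
    for lam in eigenvalues:
        product *= 1.0 / (1.0 + lam)
    return product


def f_for_three_groups(lam, n_total, n_vars):
    """Exact F transformation of Wilks's lambda when there are three groups.
    Returns (F, numerator df, denominator df)."""
    root = math.sqrt(lam)
    denom_half = n_total - n_vars - 2
    f_value = (1.0 - root) / root * denom_half / n_vars
    return f_value, 2 * n_vars, 2 * denom_half


lam = wilks_lambda([1.077, 0.046])             # eigenvalues reported above
f_value, df1, df2 = f_for_three_groups(lam, n_total=80, n_vars=7)
print(round(lam, 3), round(f_value, 2), df1, df2)
```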

The rank-orderings of the variables according to all six methods
are given in Table 2.


Insert Table 2 about here


As indicated by the resulting value of the coefficient of concordance
(W = 0.41), there is moderate agreement among the orderings yielded by
the six methods. Two pairs of rankings are of particular interest.
First, it may be noted that consideration of the second LDF in using
method III drastically modifies the ordering yielded by method V, which
considers only the leading LDF (rank-order correlation of -0.18).
Second, it may be noted that the ordering for method IV is identical to
that indicated by method VI. In light of the magnitude of the second
eigenvalue (0.046), this is not too surprising. The standardized
coefficients for the second LDF ranged from 0.064 to 2.609; the products
of the second eigenvalue and these coefficients do not contribute a
great deal (in a relative sense) to the composite, i.e., the first
product in the composite is the first eigenvalue (1.077) times
coefficients ranging from 0.949 to 4.769. Thus five methods (I-V)
remained to be compared in terms of discriminatory power of suggested
subsets of variables.
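
The two agreement measures used above, the coefficient of concordance W and the rank-order correlation, can be sketched as follows. This is a minimal illustration assuming untied ranks; the function names and the example rankings are hypothetical and are not the rankings of Table 2.

```python
def kendalls_w(rankings):
    """Kendall's coefficient of concordance for m rankings of n items.
    Each ranking lists the rank (1..n) given to items 1..n; no ties assumed."""
    m = len(rankings)
    n = len(rankings[0])
    totals = [sum(ranking[i] for ranking in rankings) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


def spearman_rho(ranks_a, ranks_b):
    """Spearman rank-order correlation between two untied rankings."""
    n = len(ranks_a)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


# Hypothetical ranks assigned to four variables by three ordering methods.
example = [[1, 2, 3, 4],
           [2, 1, 3, 4],
           [1, 3, 2, 4]]
print(round(kendalls_w(example), 2))            # about 0.78
print(spearman_rho(example[0], example[1]))     # agreement between two methods
```

W ranges from 0 (no agreement among the methods) to 1 (identical orderings), so a value near 0.4 indicates only moderate agreement.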

Subsets of variables of sizes 6, 5, 4, 3, 2, and 1 were specified
according to each of the five methods. The discriminatory power of
a subset of a given size based on each method was assessed using
the results of a classification analysis. The classification statistic used
is one which provides posterior probabilities of group membership and
which uses prior probabilities of group membership (15 in Huberty, 1975a).
A linear classification rule was employed in this study since for each
subset of variables considered, equality of covariance matrices was
concluded.

Both internal and external classification results were obtained.
The internal analysis is based on measures for those students on
whom the basic statistics (mean vectors and covariance matrices) were
computed; these same measures are then resubstituted to obtain the
values for the classification rules. In an external analysis, statistics
based on one set of students are used to classify "new" students. Even
though a quadratic rule will yield greater internal accuracy when a
linear rule is considered appropriate, external classification based on
a linear rule is often superior (see Huberty & Curry, 1975). The external
analysis is essentially that suggested by Lachenbruch (1967).
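
The difference between the internal (resubstitution) and external (leave-one-out) analyses can be sketched as follows. This is an illustration under stated assumptions, not the classification statistic of the paper: it uses a linear rule with a pooled covariance matrix, priors estimated from group sizes, and synthetic data; all function names are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def fit_linear_rule(X, y):
    """Group means, inverse pooled covariance matrix, and priors
    (priors taken proportional to group sizes, an assumption)."""
    groups = np.unique(y)
    means = np.array([X[y == g].mean(axis=0) for g in groups])
    pooled = np.zeros((X.shape[1], X.shape[1]))
    for g, m in zip(groups, means):
        d = X[y == g] - m
        pooled += d.T @ d
    pooled /= len(y) - len(groups)
    priors = np.array([(y == g).mean() for g in groups])
    return groups, means, np.linalg.inv(pooled), priors


def classify(x, rule):
    """Assign x to the group with the largest posterior probability."""
    groups, means, pooled_inv, priors = rule
    d = means - x
    dist2 = np.einsum("ij,jk,ik->i", d, pooled_inv, d)  # squared distances
    return groups[np.argmax(np.log(priors) - 0.5 * dist2)]


def internal_accuracy(X, y):
    """Resubstitution: classify the same cases used to build the rule."""
    rule = fit_linear_rule(X, y)
    return float(np.mean([classify(x, rule) == g for x, g in zip(X, y)]))


def external_accuracy(X, y):
    """Leave-one-out: each case is classified by a rule built without it."""
    hits = 0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        rule = fit_linear_rule(X[keep], y[keep])
        hits += int(classify(X[i], rule) == y[i])
    return hits / len(y)


# Synthetic, well-separated three-group data (not the study's data).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(size=(20, 2)) for c in centers])
y = np.repeat([1, 2, 3], 20)
print(internal_accuracy(X, y), external_accuracy(X, y))
```

With overlapping groups the internal proportion is typically the more optimistic of the two, which is why both are reported.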

Results

Proportions of correct classifications yielded by the internal and
external analyses for the six subset sizes across the five ordering
methods are given in Table 3.


Insert Table 3 about here


The rank-orderings of the classification proportions across the six
subset sizes for the five methods produced moderate to low coefficients
of concordance for the internal (W = 0.43) and external (W = 0.05)
analyses. Thus, for external analyses, a considerable discordance of
classification accuracy resulted. However, when examining the
proportions two conclusions might be drawn: (1) Method IV (composite of
weighted coefficients) yielded the highest proportion for all subset
sizes, with the exception of subsets of size six. (2) For a given subset
size and across the five methods, the proportions do not differ greatly;
for internal classification the maximum range of proportions was 0.075
(subset of size 2), and for external classification the maximum range
was 0.100 (subset of size 2).

An analysis of another data set (also a three-group situation)
revealed that the second eigenvalue was again small relative to the
first, and methods IV and VI yielded nearly identical variable
rank-orderings; the only discrepancy was that the ranks of the two
poorest variables were interchanged. As was the case for results
reported in the current paper, the results of an analysis using the
second data set indicated that the consideration of a second
(nonsignificant) LDF might be expected to modify the ordering yielded
by method V, which considers only the leading LDF.

There are some sidenotes of interest. First, proportions of correct
internal classifications did not always increase with an increase in
the number of variables entered into the analysis. Second, proportions
based on an external analysis generally increased with a decrease
in the number of variables entered, until the number decreased to one
(Huberty & Curry, 1975). Third, once two variables were entered into
the analysis the classification accuracy was not greatly affected,
internally or externally, by the inclusion of more variables. This
latter result may be a function of the size of the variable
intercorrelations.

Discussion

Based on the results of this preliminary investigation, to infer that
one of the six variable ranking methods is superior to the rest would
be folly, indeed. There simply was not (that) much of a difference in
the classification accuracy across the six methods. Essentially the
same general conclusion was reached when the second data set was
analyzed (but not reported on here).

Additional real data situations need to be investigated with more
group overlap, more criterion groups, different types of variables, and
other variations, plus combinations of these variations. It may be
difficult to locate real data sets having more than three groups
and possessing some of the above variations for which the linear
classification rule and most of the ordering methods proposed are
appropriate. Hence, it may be desirable to conduct a Monte Carlo
study, in which the true ordering of the variables is known, so as to
determine which of the methods is best and which, if any, are good at all.
Of equal, if not greater, interest is the variable ordering or selection
problem when quadratic classification is appropriate (Lachenbruch, 1975).


REFERENCES


Cochran, W. G. On the performance of the linear discriminant function.
Technometrics, 1964, 6, 179-190.

Cooley, W. W., & Lohnes, P. R. Multivariate data analysis. New York:
Wiley, 1971.

Dixon, W. J. (Ed.). Biomedical computer programs. Berkeley, Calif.:
University of California Press, 1973.

Eisenbeis, R. A., & Avery, R. B. Discriminant analysis and classification
procedures. Lexington, Mass.: Heath, 1972.

Eisenbeis, R. A., Gilbert, G. G., & Avery, R. B. Investigating the
relative importance of individual variables and variable subsets in
discriminant analysis. Communications in Statistics, 1973, 2,
205-219.

Henschke, C. I., & Chen, M. M. Variable selection technique for
classification problems. Educational and Psychological Measurement,
1974, 34, 11-15.

Huberty, C. J Discriminant analysis. Review of Educational Research,
1975, 45, 543-598. (a)

Huberty, C. J The stability of three indices of relative variable
contribution in discriminant analysis. Journal of Experimental
Education, 1975, 44, 59-64. (b)

Huberty, C. J, & Curry, A. R. Linear versus quadratic multivariate
classification. Paper presented at the annual meeting of the American
Educational Research Association, Washington, April 1975.

Huberty, C. J, & Smith, D. U. Measures of discrimination among
achievement levels in statistics. Paper presented at the annual meeting
of the American Educational Research Association, Washington, April 1975.

Lachenbruch, P. A. An almost unbiased method of obtaining confidence
intervals for the probability of misclassification in discriminant
analysis. Biometrics, 1967, 23, 639-645.

Lachenbruch, P. A. Some unsolved problems in discriminant analysis.
Institute of Statistics, University of North Carolina, Mimeo Series
No. 13959, December, 1975.

Lutz, J. G. On the rejection of Hotelling's single sample T2. Educational
and Psychological Measurement, 1974, 34, 19-23.

Timm, N. H. Multivariate analysis with applications in education and
psychology. Belmont, Calif.: Brooks/Cole, 1975.


Table 1
Means, Standard Deviations*, Univariate F's
and Within-Groups Correlation Coefficients


                                   Group 1   Group 2   Group 3
No. Variable Name                  (n1=19)   (n2=37)   (n3=24)      F    GREV  GREQ  UMSH  YCMS  UGPA  GGPA

1.  Age                              28.05     31.16     33.25    3.11    .25  -.09   .02   .79  -.09  -.06
                                    (4.74)    (7.43)    (7.13)
2.  GRE Verbal (GREV)               558.21    505.35    467.92    5.51          .13  -.12   .25   .20   .15
                                   (82.91)   (84.41)   (98.95)
3.  GRE Quantitative (GREQ)         626.84    543.95    474.50   21.08               .23  -.18  -.24   .02
                                   (64.28)   (89.49)   (61.50)
4.  Undergraduate Mathematics/       21.84     15.35     10.88    2.54                     -.28  -.12  -.11
    Statistics Hours (UMSH)        (16.08)   (16.69)   (14.25)
5.  Number Years Since Last           6.16     10.22     12.04    4.44                           -.15  -.00
    Mathematics/Statistics          (4.06)    (7.33)    (6.77)
    Course (YCMS)
6.  Undergraduate GPA (UGPA)          3.33      2.99      2.81    7.66                                  .16
                                    (0.54)    (0.38)    (0.41)
7.  Graduate GPA (GGPA)               3.75      3.72      3.51    4.58
                                    (0.34)    (0.28)    (0.32)

*Given in parentheses.

Table 2

Rank-Orderings of Variables


                                         Method
             I        II            III              IV           V          VI
           (F's)  (Stepwise)  (Communalities)    (Weighted      (r's)  (Coefficients)
                                               Coefficients)
Best         3        3              7                3           3          3
             6        6              3                6           6          6
             2        7              6                1           2          1
             7        4              5                4           5          4
             5        1              2                5           7          5
             1        2              1                2           1          2
Poorest      4        5              4                7           4          7


Table 3

Proportions of Correct Classifications(a)


No. Variables                        Method              Maximum
in Subset                      I      II      V          Difference

6          Internal           650    663     650            063
           External           613    588     613            025
5          Internal           650    675     650            025
           External           600    600     660            000
4          Internal           662    683     638            062
           External           600    600     613            038
3          Internal           675    688     675            025
           External           625    650     625            025
2          Internal           683    638     688            075
           External           675    675     675            100
1          Internal           513    513     513            025
           External           488    483     488            013

(a) Decimals are omitted.

[The columns for Methods III and IV are not legible in this copy; the
third column shown is taken to be Method V.]


